Computer simulations of language change notes

This website collects my personal notes on Computer simulations of language change. These notes are provided to bring full transparency to my research process. Of course, since they are only notes, they do not reflect my final thoughts on a topic, and should not be interpreted as such. To read finished papers, please consult my website. Do not use these notes as a basis for your own scientific research. Start from high-quality, peer-reviewed scientific literature instead.

Frequency effects in language learning and processing

p. 1

Introduction

Stefan Th. Gries

p. 7

What can we count in language, and what counts in language acquisition, cognition and use?

Nick C. Ellis

p. 1

Summary

Ellis discusses the interrelation of frequency and cognition – in cognition in general as well as in (second) language cognition – and, most importantly given current discussions in usage-based approaches to language, provides a detailed account of the factors that drive the kind of associative learning assumed by many in the field: type and token frequency, Zipfian distributions as well as recency, salience, perception, redundancy etc. Just as importantly, Ellis derives a variety of conclusions or implications of these factors for our modeling of learning and acquisition processes, which sets the stage for the papers in this volume.

p. 7

Frequency and cognition

Cognition

three major experiential factors that affect cognition

frequency
recency
context

↓

The more times we experience something, the stronger our memory for it, and the more fluently it is accessed.
The more recently we have experienced something, the stronger our memory for it, and the more fluently it is accessed. (Hence your more fluent reading of the prior sentence than the one before).
The more times we experience conjunctions of features, the more they become associated in our minds and the more these subsequently affect perception and categorization; so a stimulus becomes associated to a context and we become more likely to perceive it in that context.

p. 7-8

percept

a complex state of consciousness in which antecedent sensation is supplemented by consequent ideas which are closely combined to it by association

p. 8

Categorisation

categorisation

testament to human ‘tallying’ (Ellis 2002)

↓

fuzzy natural categories

e.g. Wittgenstein (1953) → organised through family resemblances

[S]on may be like mother, and mother like sister, but in a very different way. And we learn about these families, like our own, from experience. Exemplars are similar if they have many features in common and few distinctive attributes (features belonging to one but not the other); the more similar are two objects on these quantitative grounds, the faster are people at judging them to be similar (Tversky 1977).

prototypes

exemplars which are most typical of a category
similar to many members of the category, but not similar to members of other categories
(Rosch & Mervis 1975, Rosch et al. 1976)

Prototypes are judged faster and more accurately, even if they themselves have never been seen before […] Such effects make it very clear that although people don’t go around consciously counting features, they nevertheless have very accurate knowledge of the underlying frequency distributions and their central tendencies.

p. 9

Frequency and language cognition

Why frequency is important

Language processing is very sensitive to usage frequency
↓
at all levels of language representation!

↳ (Ellis 2002)

phonology and phonotactics
reading
spelling
lexis
morphosyntax
formulaic language
language comprehension
grammaticality
sentence production
syntax

Language knowledge involves statistical knowledge, so humans learn more easily and process more fluently high frequency forms and ‘regular’ patterns which are exemplified by many types and which have few competitors.

language learning according to psycholinguists

implicit associative learning of representations that reflect the probabilities of occurrence of form-function mappings

the rules of language

structural regularities
emerge from analysis of distributional characteristics of language input
≈ what AI models are doing nowadays

Theoretical framework

usage-based linguistics

a theoretical model
‘we learn linguistic constructions while engaging in communication’

constructions

form-meaning mappings
conventionalised in the speech community
entrenched as language knowledge in the learner’s mind

p. 10

Goldberg’s (2006) Construction Grammar argues that all grammatical phenomena can be understood as learned pairings of form (from morphemes, words, idioms, to partially lexically filled and fully general phrasal patterns) and their associated semantic or discourse functions: ‘‘the network of constructions captures our grammatical knowledge in toto, i.e. It’s constructions all the way down’’

Frequency and Second Language Acquisition

skipped

p. 11

Construction learning as associative learning from usage

constructions

form-function mappings
the ‘units’ of language

↓ then…

language acquisition

inducing associations between form and function from experience of language usage

↳ how?

distributional analysis of the language stream
parallel analysis of contingent perceptual activity
abstract constructions? → learnt from the conspiracy of concrete exemplars
statistical learning mechanisms (Christiansen and Chater 2001)

Determinants of learning

1. input frequency

type-token frequency
Zipfian distribution
recency

2. form

salience
perception

3. function

prototypicality of meaning
importance of form for message comprehension
redundancy

4. interactions between all

contingency of form-function mapping

Input frequency

Construction frequency

frequency of exposure

promotes learning

↓

phonology and phonotactics, reading, spelling, lexis, morphosyntax, formulaic language, language comprehension, grammaticality, sentence production, syntax

↳ sensitivity to input frequencies

shows that language users actually register these frequencies
⇒ evidence for usage-based models of language acquisition (and processing)

p. 12

Type and token frequency

token frequency	type frequency
how often a particular form appears in the input	the number of distinct lexical items that can be substituted in a given slot in a construction*

* can be both a word-level construction for inflection or a syntactic construction
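
The two counts can be told apart with a toy tally. This is a minimal sketch, assuming a hypothetical list of (lexical item, construction) pairs; the data and the function names are invented for illustration and do not come from Ellis.

```python
# Hypothetical toy input: (lexical item, construction) pairs heard by a learner.
# The data are invented purely for illustration.
observations = [
    ("walk", "V-ed"), ("walk", "V-ed"), ("jump", "V-ed"),
    ("play", "V-ed"), ("talk", "V-ed"),
    ("go", "went"), ("go", "went"), ("go", "went"), ("go", "went"),
]

def token_frequency(observations, construction):
    """How often the construction occurs in the input, counting repeats."""
    return sum(1 for _, c in observations if c == construction)

def type_frequency(observations, construction):
    """How many distinct lexical items fill the construction's slot."""
    return len({item for item, c in observations if c == construction})

for construction in ("V-ed", "went"):
    print(construction,
          "tokens:", token_frequency(observations, construction),
          "types:", type_frequency(observations, construction))
```

In this toy input the regular pattern has many types with few tokens each, while the irregular form has one type with many tokens, which is the contrast the next two subsections build on.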

Type frequency

type

licenses the productivity of phonological, morphological and syntactic patterns
why? 👁 ↓

the more lexical items that are heard in a certain position in a construction, the less likely it is that the construction is associated with a particular lexical item and the more likely it is that a general category is formed over the items that occur in that position;
the more items the category must cover, the more general are its criterial features and the more likely it is to extend to new items;
high type frequency ensures that a construction is used frequently, thus strengthening its representational schema and making it more accessible for further use with new items

(so, in conclusion, it’s just cognitively advantageous, and thus easy to defend this position)

Token frequency

token

promotes entrenchment / conservation of irregular forms and idioms

Irregular forms only survive because they are high frequency.

p. 13

Zipfian distribution

Zipf’s law (1949)

“in human language, the frequency of words decreases as a power function of their rank in the frequency table”
many language events (e.g., frequencies of phoneme and letter strings, of words, of grammatical constructs, of formulaic phrases, etc.) across scales of analysis follow this law (Ferrer i Cancho and Solé 2001, 2003)
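
The rank-frequency relation can be sketched as follows. This is only an illustration: the exponent of 1.0 and the choice to anchor the curve on the most frequent word are assumptions, and the toy text is invented and far too small to test the law.

```python
from collections import Counter

def rank_frequency(tokens):
    """Return (rank, word, count) triples, most frequent word first."""
    counts = Counter(tokens).most_common()
    return [(rank, word, count)
            for rank, (word, count) in enumerate(counts, start=1)]

def zipf_expected(rank, top_count, exponent=1.0):
    """Frequency predicted for a rank under f(r) = top_count / r**exponent.

    The exponent is a free parameter; 1.0 is an assumed default.
    """
    return top_count / rank ** exponent

# Invented toy text; it only shows the mechanics of the comparison.
tokens = "the cat saw the dog and the dog saw the bird".split()
table = rank_frequency(tokens)
top_count = table[0][2]
for rank, word, count in table:
    print(rank, word, count, round(zipf_expected(rank, top_count), 2))
```

On a real corpus one would compare the observed counts with the predicted ones across ranks, typically on log-log axes.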

learning categories from exemplars

acquisition is optimized by the introduction of an initial, low-variance sample centered upon prototypical exemplars (Elio and Anderson 1981, 1984)
allows learners to get a fix on what will account for most of the category members

p. 14

Recency

recency effects / priming

observed in phonology, conceptual representations, lexical choice and syntax (Pickering and Ferreira 2008)

↓

syntactic priming

the phenomenon of using a particular syntactic structure given prior exposure to the same structure

This behavior has been observed when speakers hear, speak, read or write sentences (Bock 1986; Pickering 2006; Pickering and Garrod 2006).

p. 15

Form (salience and perception)

salience

general perceived strength of stimuli

↳ low salience cues

tend to be less easily learnt

Many grammatical meaning-form relationships, particularly those that are notoriously difficult for second language learners like grammatical particles and inflections such as the third person singular -s of English, are of low salience in the language stream. For example, some forms are more salient: ‘today’ is a stronger psychophysical form in the input than is the morpheme ‘-s’ marking 3rd person singular present tense, thus while both provide cues to present time, today is much more likely to be perceived, and -s can thus become overshadowed and blocked, making it difficult for second language learners of English to acquire (Ellis 2006, 2008; Goldschneider and DeKeyser 2001).

Function

Prototypicality of meaning

central category members

some members of categories are more typical of the category than others → show the family resemblance more clearly

↓

prototype

the ‘best’ example of the category
summarises the most representative attributes of category

p. 16

↳ token frequency

very important contributor to the centrality of the prototype

Redundancy

redundant cues

tend not to be acquired
also found in the Rescorla-Wagner model (1972)

Not only are many grammatical meaning-form relationships low in salience, but they can also be redundant in the understanding of the meaning of an utterance. For example, it is often unnecessary to interpret inflections marking grammatical meanings such as tense because they are usually accompanied by adverbs that indicate the temporal reference. Second language learners’ reliance upon adverbial over inflectional cues to tense has been extensively documented […]
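
Why a redundant cue tends not to be acquired can be illustrated with the Rescorla-Wagner update rule. This is a minimal sketch under stated assumptions: the model's two rate parameters are merged into a single learning rate `alpha`, the trial counts are arbitrary, and the cue names "adverb" and "inflection" are labels chosen to echo the passage above, not data from any study.

```python
def rescorla_wagner(trials, alpha=0.2, lam=1.0):
    """Run Rescorla-Wagner updates over a sequence of trials.

    Each trial is a (cues, outcome_present) pair. On every trial each
    present cue changes by alpha * (target - summed strength of present cues).
    alpha stands in for the model's combined rate parameters (an assumption).
    """
    strength = {}
    for cues, outcome_present in trials:
        target = lam if outcome_present else 0.0
        prediction = sum(strength.get(c, 0.0) for c in cues)
        error = target - prediction
        for c in cues:
            strength[c] = strength.get(c, 0.0) + alpha * error
    return strength

# Blocking: the adverb alone predicts the meaning first, then the
# inflection is added as a second, redundant cue.
blocked = rescorla_wagner(
    [(("adverb",), True)] * 30 + [(("adverb", "inflection"), True)] * 30
)
# Control: both cues are present together from the very first trial.
control = rescorla_wagner([(("adverb", "inflection"), True)] * 30)

print("inflection after blocking:", round(blocked["inflection"], 4))
print("inflection in control:", round(control["inflection"], 4))
```

Because the adverb already predicts the outcome almost perfectly by the time the inflection appears, there is hardly any prediction error left to drive learning of the inflection; in the control run the two cues share the available strength.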

Interactions between these (contingency of form-function mapping)

contingency of mapping (Shanks 1995)

important

p. 16-17

Consider how, in the learning of the category of birds, while eyes and wings are equally frequently experienced features in the exemplars, it is wings which are distinctive in differentiating birds from other animals. Wings are important features to learning the category of birds because they are reliably associated with class membership, eyes are neither. Raw frequency of occurrence is less important than the contingency between cue and interpretation.
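
The bird example can be put in numbers. The sketch below uses ΔP, a contingency statistic that is not named in these notes and is brought in here as an assumption: the probability of one event given another, minus its probability when the other is absent. The counts are invented, and the conditioning event is category membership (bird vs. not bird), a choice made only to keep the toy table well defined.

```python
def delta_p(a, b, c, d):
    """Delta-P from a 2x2 table of counts.

    a: X present, Y present    b: X present, Y absent
    c: X absent,  Y present    d: X absent,  Y absent
    Returns P(Y | X) - P(Y | not X).
    """
    if a + b == 0 or c + d == 0:
        raise ValueError("both rows of the table need at least one case")
    return a / (a + b) - c / (c + d)

# Invented sample: 50 birds and 50 other animals (X = "is a bird").
# Eyes: every bird and every other animal has them.
eyes = delta_p(a=50, b=0, c=50, d=0)
# Wings: every bird has them, and 5 of the other animals do too.
wings = delta_p(a=50, b=0, c=5, d=45)

print("eyes:", eyes, "wings:", round(wings, 2))
```

Both features are present in every bird exemplar, so their raw frequency within the category is identical, yet only wings receive a high contingency score while eyes score zero.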

p. 17

The many aspects of frequency and their research consequences

p. 18

[W]hat we really want is a model of usage and its effects upon acquisition. We can measure these factors individually. But such counts are vague indicators of how the demands of human interaction affect the content and ongoing co-adaptation of discourse, how this is perceived and interpreted, how usage episodes are assimilated into the learner’s system, and how the system reacts accordingly. We need theoretical models of learning, development, and emergence that takes these factors into account dynamically.

Language learning as estimation from sample: implications for instruction

(skipped)

p. 20

Exploring what counts

Not everything that we can count in language counts in language cognition and acquisition

If it did, the English articles the and a alongside frequent morphological inflections would be among the first learned English constructions, rather than the most problematic in L2A.

associative learning affected by…

1. factors relating to the form

i.e. frequency, salience

2. factors relating to learner attention

i.e. automaticity, transfer, blocking

↳ raw frequency counts are too simple (that’s the idea)

p. 21

Emergentism and complexity

emergentism

general framework
quantitative, multivariate, multi-agent

Agents

agents everywhere

“from neuron, through self, to society”
⇒ language emergence as a function of interactions within and between them

[M]ore recently, work within Emergentism, Complex Adaptive Systems (CAS), and Dynamic Systems Theory (DST) has started to describe a number of scale-free, domain-general processes which characterize the emergence of pattern across the physical, natural, and social world

Complexity theory

Emergentism and Complexity Theory (MacWhinney 1999; Ellis 1998; Elman et al. 1996; Larsen-Freeman 1997; Larsen-Freeman and Cameron 2008; Ellis and Larsen-Freeman 2009, 2006)

idea

how do complex patterns emerge from the interactions of many agents?

p. 22

‘‘Emergentists believe that simple learning mechanisms […] suffice to drive the emergence of complex language representations.’’ (Ellis 1998, p. 657)

Complex adaptive system

Language considered as a CAS of dynamic usage and its experience involves the following key features

The system consists of multiple agents (the speakers in the speech community) interacting with one another.
The system is adaptive, that is, speakers’ behavior is based on their past interactions, and current and past interactions together feed forward into future behavior.
A speaker’s behavior is the consequence of competing factors ranging from perceptual mechanics to social motivations.

↓

advantage of CAS

provides a unified account of seemingly unrelated linguistic phenomena (Holland 1998, 1995; Beckner et al. 2009)

variation at all levels of linguistic organization
the probabilistic nature of linguistic behavior
continuous change within agents and across speech communities
the emergence of grammatical regularities from the interaction of agents in language use
stage-like transitions due to underlying non-linear processes

Much of CAS research investigates these interactions through the use of computer simulations (Ellis and Larsen-Freeman 2009).
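
To give a feel for what such a simulation looks like, here is a deliberately tiny multi-agent sketch. Everything in it is an assumption made for illustration (two competing variants, random pairing of speakers and hearers, a fixed learning rate); it is not a model from Ellis and Larsen-Freeman or any other cited work.

```python
import random

def simulate(n_agents=20, n_interactions=2000, learning_rate=0.05, seed=1):
    """Toy multi-agent model: each agent holds P(use variant A).

    In every interaction a random speaker produces A or B according to its
    own probability, and a random hearer shifts its probability slightly
    toward what it just heard, so behaviour depends on past interactions.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    probs = [rng.random() for _ in range(n_agents)]
    for _ in range(n_interactions):
        speaker, hearer = rng.sample(range(n_agents), 2)
        heard_a = rng.random() < probs[speaker]
        target = 1.0 if heard_a else 0.0
        probs[hearer] += learning_rate * (target - probs[hearer])
    return probs

final = simulate()
print("mean P(A):", round(sum(final) / len(final), 3))
print("spread:", round(max(final) - min(final), 3))
```

The sketch covers only the first two features listed above (multiple interacting agents, adaptation to past interactions); competing perceptual and social factors would have to be added as further terms in the update.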

Zipf, corpora, and complex adaptive systems

p. 23

Principle of Least Effort (Zipf 1949)

reasoning behind Zipf’s law
balancing…
1. speaker effort (optimized by having fewer words to be learned and accessed in speech production)
2. ambiguity of speech comprehension (minimized by having many words, one for each different meaning)

It has become a hallmark of Complex Systems theory where so-called fat-tailed distributions characterize phenomena at the edge of chaos, at a self-organized criticality phase-transition point midway between stable and chaotic domains.

p. 24

Language usage, social roles, language learning, and conscious experience are all socially situated, negotiated, scaffolded, and guided. They emerge in the dynamic play of social intercourse. All these factors conspire dynamically in the acquisition and use of any linguistic construction. The future lies in trying to understand the component dynamic interactions at all levels, and the consequent emergence of the complex adaptive system of language itself.

p. 35

Are effects of word frequency effects of context of use?

William D. Raymond and Esther L. Brown

p. 2

Summary

Raymond & Brown explore a range of frequency-related factors and their impact on initial fricative reduction in Spanish. They begin by pointing out that results of previous studies have been inconclusive, in part because many different studies have included only partially overlapping predictors and controls; in addition, the exact causal nature of frequency effects has also proven elusive. They then study data on [s]-initial Spanish words from the free conversations from the New Mexico-Colorado Spanish Survey, a database of interviews and free conversations initiated in 1991. A large number of different frequency-related variables is coded for each instance of an s-word, including word frequency, bigram frequency, transitional probability (in both directions), and others, and these are entered into a binary logistic regression to try to predict fricative reduction.

The results show that s-reduction is influenced by many predictors, too many to discuss here in detail. However, one very interesting conclusion is that, once a variety of contextual frequency measures is taken into consideration, then non-contextual measures did not contribute much to the regression model anymore, which is interesting since it forces us to re-evaluate our stance on frequency, from a pure repetition-based view to a more contextually-informed one, which in itself would constitute a huge conceptual development (cf. also below).

p. 35

Introduction

The link between word frequency and reduction

word frequency

the relative cumulative experience that speakers have with words

p. 36

↓

word frequency and reduction

widely studied → word frequency impacts both diachronic change and synchronic production variation

Investigations of the processes of sound change in language going back over a century have noted that more frequent words are shorter and change more quickly than less frequent words (Schuchardt 1885; Zipf 1929).

In studies of synchronic pronunciation variation, evidence has been offered that higher word frequency is associated with more word reduction in speech production, as measured by both categorical measures of segment reduction or deletion (Bybee 2001, 2002; Krug 1998; Jurafsky et al. 2001; Raymond, Dautricourt, and Hume 2006) and also continuous measures of reduction, including durational shortening (Gahl 2008; Jurafsky et al. 2001; Pluymaekers, Ernestus, and Baayen 2005) and some acoustic parameters (Ernestus et al. 2006; Myers and Li 2007).

↓

how does frequency of word use contribute to the reductive process?

Footnotes on the relation between frequency and reduction

other factors

lexical structure and class
extra-lexical phonological context
prosodic environment
speech rate
sociolinguistic factors
probabilistic variables
but: usually frequency is predictive at some level

[H]owever, frequency effects are not ubiquitous.

For example, Pluymaekers et al. (2005) found that word frequency affected reduction of affix form and duration for most but not all of the morphologically complex Dutch words they studied.
Similarly, some of the high-frequency function words examined by Jurafsky et al. (2001) had low rates of reduction, despite their high frequency and a control for phonological context.
Finally, Cohn et al. (2005) found no effect of word frequency on durational shortening of homophones, although Gahl (2008) did.

↓ how is this possible?

1. methodological differences

sample size (see Gahl 2008)
the set of factors considered in the study

↓

p. 37

2. likelihood

the likelihood that a word occurs in discourse contexts in a phonological environment that promotes reduction (Bybee 2001, 2002; Timberlake 1978)

For example, the rate of word-final t/d deletion in English is lower for words that are more likely to occur in the context of a following vowel in speech (Guy 1991; Bybee 2002).
Similarly, reduction rates of word-initial [s] in Spanish are higher for words that are more likely to occur in the context of a preceding non-high vowel in speech (Brown 2004, 2006).

Motivation for reduction

automation

reduction is the result of the automation of production processes (Bybee 2001, 2002)
claimed to result in more casual, reduced forms
ultimately: registered as change in lexical representation

↓ ??

nature of reduction

reduction driven purely by frequency of use should apply uniformly across the word
however: it occurs on certain segments or syllables

↓ why?

influence of lexical structure and discourse environments

leads to differential articulatory effects, automation processes (and ultimately, reduction)

⇒ in a study, you have to control for all of these factors

p. 38

In the current study whether word frequency plays an independent role in on-line s- reduction is addressed by controlling both word frequency and frequency of occurrence of a word in phonological environments known to promote articulatory reduction of [s-]. The effects of other probabilistic measures are also assessed, to determine whether they contribute to s- reduction. Both intra- and extra-lexical phonological contexts are controlled, and comparison of their effects is used to determine to what extent reduction can be attributed to lexical representations or on-line articulatory processes.

Data and methods

p. 41

As an illustration of the measures in Table 2, consider the excerpt from the corpus transcription in (1).

(1) . . . a mi sobrino, porque yo . . .
    . . . to my nephew, because I . . .

The s- word sobrino in the token in (1) occurs 11 times in the NMCOSS corpus, giving it a frequency per million of 146 and a log frequency of 2.17. The preceding word bigram in this token is mi sobrino, which has a frequency in the corpus statistics of 6, and the frequency of the word preceding the s- word, mi, is 485, so that the predictability of sobrino from mi is 6/485 = .0124. The s- of sobrino in this token is followed (word-internally) by the non-high vowel /o/, which is a context hypothesized to favor s- reduction. However, the vowel preceding s- is the high vowel /i/ in mi, which is hypothesized not to favor reduction. Overall in the corpus the word sobrino occurs after a non-high vowel (/o/, /a/, or /e/) only once, giving sobrino a FFC of 1/11 = .091. The frequency with which /i/ precedes /s/ at a word boundary in the corpus is 407, and the log of this frequency per million phones is 2.61.
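
The two ratio measures in this worked example can be recomputed directly from the counts quoted above. This is a small sketch; the function and variable names are mine, and the per-million and log figures are left out because the notes do not give the corpus size needed to recompute them.

```python
# Counts quoted in the worked example for the token "mi sobrino".
bigram_count = 6            # corpus frequency of "mi sobrino"
preceding_word_count = 485  # corpus frequency of "mi"
word_count = 11             # corpus frequency of "sobrino"
favourable_count = 1        # "sobrino" preceded by a non-high vowel

def predictability(bigram_count, preceding_word_count):
    """Probability of the s-word given the preceding word."""
    return bigram_count / preceding_word_count

def ffc(favourable_count, word_count):
    """Share of a word's tokens that occur in a reduction-favouring context."""
    return favourable_count / word_count

print("predictability:", round(predictability(bigram_count, preceding_word_count), 4))
print("FFC:", round(ffc(favourable_count, word_count), 3))
```

Both values are properties of the word type in the corpus, unlike the preceding and following vowel, which describe the individual token.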

p. 44

Results

p. 44-45

main effect of preceding and following phonological contexts of s-
- non-high vowels (before) predict higher reduction rates (2.41 times more likely)
- non-high vowels (after) predict higher reduction rates (3.03 times more likely)
no lexical stress on the initial syllable of an s-word predicts higher reduction (1.59 times more likely)
predictability hardly leads to higher reduction (1.01 times – so 1% – more likely)
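
A note on reading these figures. Assuming they are odds ratios, as is usual for binary logistic regression (the notes do not say so explicitly), "2.41 times more likely" multiplies the odds of reduction, not its probability. The sketch below shows the difference; the baseline reduction rate of 0.20 is hypothetical and not taken from the study.

```python
import math

def apply_odds_ratio(baseline_prob, odds_ratio):
    """Probability after multiplying the baseline odds by an odds ratio."""
    odds = baseline_prob / (1 - baseline_prob) * odds_ratio
    return odds / (1 + odds)

# Figures from the notes, read here as odds ratios (an assumption).
odds_ratios = {
    "preceding non-high vowel": 2.41,
    "following non-high vowel": 3.03,
    "unstressed initial syllable": 1.59,
    "predictability": 1.01,
}

baseline = 0.20  # hypothetical reduction rate, chosen only for illustration
for name, ratio in odds_ratios.items():
    coefficient = math.log(ratio)  # the corresponding log-odds coefficient
    print(name, round(coefficient, 3),
          round(apply_odds_ratio(baseline, ratio), 3))
```

Under this reading, the change in probability that a given ratio produces depends on where the baseline sits, which is why the ratios alone cannot be turned into reduction rates.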

p. 45

p. 47

Discussion

reduction effects

both extra- and intra-lexical factors
cumulative experience (frequency) factor and context

p. 48

The effect of the predictability of the s- word from the preceding word was small.
However, the effect of FFC [(frequency in a favourable context – how often does the word occur before a vowel conducive to reduction?)] was robust and confirms earlier findings that FFC encourages reduction in studies that did not control other probabilistic factors (Brown 2004, 2006). The effect of FFC indicates that the cumulative experience of words in reducing phonological contexts of non-high preceding vowels results in a greater likelihood of reduction than context of use alone can explain. The effect suggests that reduction of s- reflects changes in the lexical representations of words through cumulative experience with these words in reductive production contexts.

p. 49

The failure to find any robust effects of the non-contextual word and phone unit probabilities after controlling the contextual variables suggests that speakers are sensitive to how often a word occurs in environments that encourage reduction, but not measurably to non-contextual probabilistic measures of use. Consequently, an s- word’s frequency did not predict /s-/ reduction.

How can the failure to find a significant effect of word frequency on s-reduction in datasets analyzed be reconciled with other studies, in which word frequency effects on a range of reductive processes have been reported?

As noted, in most of these studies the likelihood of a word occurring in a reducing environment was not controlled. With respect to phonological context, the environments promoting reduction are generally identifiable, and tests of their importance could be readily made.
However, other variables not examined in this study may also promote reduction differentially across the word frequency range.
- For example, higher rates of speech are associated with reduction, and words may differ in their likelihood of being produced at high speech rates.
- In addition to a direct effect of speech rate on reduction, higher frequency words may, in particular, be more likely to be produced in contexts with higher speech rates than lower frequency words. Because faster speech rates may encourage reduction, high frequency words would thus have a higher probability of occurring in this reducing environment.

p. 109

Frequency, conservative gender systems and the language-learning child: Changing systems of pronominal reference in Dutch

Gunther De Vogelaer

p. 3

Summary

De Vogelaer studies the gender systems of Dutch dialects. More specifically, he starts out from the fact that Standard Dutch exhibits a gender mismatch of the binary article system and the ternary pronominal system and explores to what degree this historical change is affected by frequency effects. Results from a questionnaire study, in which subjects were put in a position to decide on the gender of nouns, indicate high- and low-frequency items behave differently: the former are affected in particular by standardization whereas the latter are influenced more by resemanticization. However, the study also cautions us that different types of data can yield very different results with regard to the effect of frequency. De Vogelaer compares frequency data from the 9-million-word Spoken Dutch Corpus to age-of-acquisition data from a target vocabulary list. Correlation coefficients indicate that the process of standardization is more correlated with the adult spoken corpus frequencies whereas resemanticization is more correlated with the age-of-acquisition data. As De Vogelaer puts it, ‘‘frequency effects are typically poly-interpretable,’’ and he rightly advises readers to regularly explore different frequency measures and register-specific frequencies.

p. 110

Introduction: change and variation in Dutch gender and frequency

Grammatical gender and pronominal gender

Present-day Standard Dutch differs from historical varieties of the language in that the difference between the marking of masculine and feminine gender is levelled out

Standard Dutch

has two definite articles
1. common de
2. neuter het
beyond the articles, the common/neuter distinction surfaces only in adjectival inflection in indefinite NPs
- e.g. een mooi-e man/vrouw ‘a beautiful man/woman’ vs. een mooi kind ‘a beautiful child’

↳ mismatch!

the pronouns distinguish three genders, but the nominal morphology only two

↳ consequence

reshuffle of pronominal gender (especially in reference to inanimates)
👁 ↓

[W]hile pronominal gender traditionally matched the grammatical gender of the antecedent inanimate noun, northern varieties of Dutch, including Standard Dutch as spoken in the Netherlands, seem to be shifting towards a semantic system of pronominal gender, operating along the lines of the Individuation Hierarchy (Siemund 2002; Audring 2006, 2009): highly individuated nouns (including neuter words such as masker ‘mask’ and apparaat ‘device’; cf. Audring 2009: 86) increasingly trigger the use of masculine pronouns such as hij ‘he’ or hem ‘him’, weakly individuated ones (including common nouns such as spinazie ‘spinach’ and wol ‘wool’; cf. Audring 2009: 98) combine with neuter het ‘it’.

↓

Pronominal gender
highly individuated nouns	weakly individuated nouns
e.g. masker, apparaat	e.g. spinazie, wol
masculine pronouns	neuter pronouns

The pronominal gender stands completely separate from grammatical gender.

Flemish dialects

This chapter focuses on pronominal gender in a number of varieties of Dutch in which the grammatical gender system still stands strong, more specifically on West and East Flemish dialects. In these dialects any instances of semantically-motivated pronouns are highly ambiguous with respect to the mechanism of language change explaining them: these instances may exemplify ongoing change within these varieties, but they may also be adopted from varieties of Dutch in which semantic agreement occurs more often.

In addition, not all changes in the choice of a pronoun referring to an antecedent noun are due to resemanticisation. Apart from resemanticisation there is also variation in that many nouns have a different gender in the traditional dialects than in the standard language. In more recent times, extensive levelling is causing these dialects to converge to Standard Dutch, so it is likely that many nouns having a different gender in the dialect than in Standard Dutch are under pressure to switch gender.

p. 111

Frequency effects as explanators

frequency effects

may cast light on which part of the changes is explained by which mechanism of change

One well-known hypothesis regarding frequency is that conservative features in language are preserved longer in high frequency items (see, e.g., Bybee and Hopper 2001: 17–18; Corbett, Hippisley, Brown, and Marriott 2001; Smith 2001)

Exact mechanism

Reason for conservation

According to Phillips (2006: 87), this characterisation holds for all changes that are implemented in cases ‘when memory fails’, for instance in sound changes affecting words of which the phonetic word form is not well entrenched in memory, which drives speakers to choose pronunciations motivated by surface phonetics, pronunciations analogous to other patterns in the language, or, in general terms, innovations requiring ‘‘access to generalisations that have emerged from word forms’’ (Phillips 2006: 157). Changes directly involving the production of word forms, however, affect the most frequent words first (e.g. deletion, assimilation, . . .).

Intra- and extradialectal innovation

From the hypothesis that infrequent items are likely to be affected by innovations motivated by generalisations that have emerged from word forms, it follows that Phillips’ generalisation typically holds in situations where the innovation originates within a speech community.
Thus in situations of innovation diffusion through contact with other varieties, other regularities may be at work. At the moment there are contradictory opinions in the literature, however, as to whether dialect contact leads to change especially in high or low-frequency items. The most widely held opinion seems to be that exposure, and hence high frequency, increases the likelihood of change.

p. 112

The confusing duality of high frequency

high frequency = higher probability of change

increases a pattern’s salience
raises the hypothesis that high-frequency items are more likely to be involved in processes of (short and long term) accommodation (so change)

↕

high frequency = resistant to change

words learnt first should be very resistant to change
low frequency words first to change → they are less entrenched!

↓ reconciliation (Phillips 2006)

Milroy (2003)
ideologically motivated changes	ideologically free changes: high-frequency items	ideologically free changes: low-frequency items
typically affect words from a certain register (e.g. formal or rather informal vocabulary)	behave as changes emerging within a speech community	behave as changes emerging within a speech community
	changes directly involving production of word forms	changes being implemented ‘as memory fails’

(last distinction by Phillips 2006: 157)

Phillips (2006) reaches this conclusion in an inductive manner, by generalising over a large set of examples of language change. She does not, however, provide a principled account of why high-frequency items are more liable to change involving mere word forms, even though they are allegedly more entrenched in language users’ minds.

One principled reason could be that types of contact that do not affect high-frequency items are too weak to have any effect at all.
Alternatively, it could also be the case that effects are observed, but not on the community level.
- Thus, some low frequency items may be affected by the contact situation in the language of a number of individual language users, but these effects disappear if the data for individuals are pooled in larger data sets.
- It would take an experimental setting to verify whether such an account is plausible.

Goals

Explore to what extent frequency effects reveal which mechanisms of language change are observed in the gender system of West and East Flemish dialects.
Provide insight into which frequency data need to be used to obtain an optimal ‘fit’ between frequency and its role for language change.

Different mechanisms of language change are propelled by different people

Labov (2007)

p. 113

change independently originating within a certain variety

typically due to the imperfect transmission of language from one generation to another
dialect contact predominantly takes place between adults
⇒ pull from non-adult language use

Investigating gender in East and West Flemish dialects of Dutch

Gender in Dutch: the progressive north ⟷ the conservative south

The Dutch gender system has been undergoing change for centuries, thereby gradually decreasing the number of exponents of the grammatical three-gender system observed in the oldest documented varieties of the language: while Middle Dutch case inflection of articles, adjectives and nouns themselves revealed whether a given noun was masculine, feminine or neuter, present-day varieties of Dutch have dispensed with most of their adnominal morphology. Thus, case marking has gone and little gender agreement is left (cf. Geerts 1966). The processes of change have unevenly affected different varieties of Dutch. More particularly, they have resulted in massive geographical variation in the domain of gender marking at the level of the dialects (as described most recently in De Schutter et al. 2005), and also in smaller differences between varieties of the standard language.

p. 114

two-gender dialect	three-gender dialect
common and neuter gender	masculine, feminine, neuter gender

In correspondence with the conservative nature of their adnominal gender system, southern varieties of Dutch have by and large preserved the traditional system of pronominal reference: anaphoric pronouns may be masculine, feminine and neuter, and are chosen on the basis of a noun’s grammatical gender.

p. 115

Hence pronominal gender in these varieties differs from northern varieties of Dutch, especially in reference to inanimates, in that the vast majority of pronominal references in the south of the language area are still in line with the triadic distinction between masculine, feminine and neuter nouns (see Geeraerts 1992 for figures). This is no longer the case in areas where two-gender dialects of Dutch are spoken.

The pronominal genders of northern, progressive variants of Dutch
hij	zij	het
highly individuated words	female persons and animals	hardly individuated forms

Unlike for adnominal gender, where only two-gender systems are considered part of the standard language, little or no normative pressure exists to adopt a three- or a two-gender grammatical system for pronominal reference (see, e.g., Haeseryn et al. 2002: 161–162).

p. 116

resemanticisation

the development by which pronominal gender is reorganised in terms of individuation
can also be considered an instance of morphological regularisation, in which a system that has grown opaque is brought in line with a number of transparent rules (see also De Vos and De Vogelaer 2011)

p. 117

Thus the degree to which speakers engage in resemanticisation reflects the transparency of the masculine-feminine distinction in grammar, or, put differently, the frequency with which these speakers are exposed to nonstandard gender agreement markers unambiguously distinguishing masculine and feminine gender (see also Hoppenbrouwers 1983).

A complication: changing lexical gender

p. 118

Pauwels (1938) discusses the gender of a large number of nouns in Belgian Dutch dialects as documented in the late 19th century, including many East and West Flemish dialects. It appears that all these dialects at the time had preserved the grammatical three-gender system, but there is a lot of variation on the lexical level: nouns that are masculine in one dialect may be feminine or neuter elsewhere. For instance, bos ‘forest’ is masculine in some dialects, but neuter in others; kraag ‘collar’ is feminine in some dialects, masculine in others, etc. Some nouns, like suiker ‘sugar’, can even be masculine, feminine, and neuter, depending on the dialect in which they are used. Since this variation has emerged in the history of Dutch, it appears that nouns may change gender in the course of history (see Geerts 1966 for examples).

Methodological preliminaries

methodology

questionnaires
fill in the blanks

p. 120

Mechanisms of gender change

The overall stability of Flemish gender

feminine gender

still referred to often enough
evidence that three-way gender system has survived

p. 122

Standardisation effects

p. 123

dialect masculine hij > Standard neuter het

e.g. artikel
- expected dialectal pronoun: hij
- expected Standard Dutch pronoun: het

p. 124

Resemanticisation?

dialect / standard Dutch pronoun > resemanticised pronoun

It appears that in the Flemish dialects there is indeed a tendency to use the neuter pronoun het ‘it’ to refer to mass nouns and abstracts, whether they are grammatically neuter or not: the ratio of het ‘it’ answers is higher for non-neuter mass nouns and abstracts than for non-neuter concrete count nouns: 16.3% (286/1752 answers) vs. 5.5% (98/1792); all nouns neuter in Standard Dutch have been kept out of the analysis. This effect is statistically significant (chi square = 108, df = 1; p < .001; OR = 3.39).
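The test above can be retraced from the two counts given in the notes. The following is a minimal sketch, assuming an uncorrected Pearson chi-square on the 2×2 table (het vs. other answers, by noun class); the function and variable names are my own, not from the chapter. An odds ratio recomputed from these counts need not match the reported value to the last digit.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def odds_ratio(a, b, c, d):
    """Odds of 'het' in the first row divided by the odds of 'het' in the second row."""
    return (a * d) / (b * c)


# Counts as given in the notes: 'het' answers out of all answers per noun class.
het_mass_abstract, total_mass_abstract = 286, 1752
het_concrete_count, total_concrete_count = 98, 1792

a = het_mass_abstract
b = total_mass_abstract - het_mass_abstract      # non-'het' answers, mass/abstract
c = het_concrete_count
d = total_concrete_count - het_concrete_count    # non-'het' answers, concrete count

chi2 = chi_square_2x2(a, b, c, d)
ratio = odds_ratio(a, b, c, d)
print(f"chi-square (df = 1): {chi2:.1f}")
print(f"odds ratio: {ratio:.2f}")
```

The chi-square value rounds to the 108 given above, which suggests that no continuity correction was applied.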

Examples with strong resemanticisation

Examples of nouns from the questionnaire with a strong tendency towards resemanticisation, i.e. reference with het ‘it’, are:

achterdocht ‘suspicion’ with 42.5%
beet ‘bite’ 37.8%
pels ‘fur (mass noun)’ 24.6%
olie ‘oil’ 23.2% and
kalk ‘lime’ 21.7%

Examples with weak resemanticisation

peper ‘pepper’ 3.0%
chocolade ‘chocolate’ 3.0%.

As in Standard Dutch, resemanticisation seems to affect pronominal gender only (cf. similar tendencies in other Germanic varieties, as described by Siemund 2002 and Audring 2006). Quite surprisingly, as was already noted in section 3.1, no tendency is observed to extend masculine hij ‘he’ to all concrete count nouns.

The argument is made that this resemanticisation is a spontaneous development, and not a copy from northern Dutch.

Source of resemanticisation

Labov (2007): two types of language change
diffusion	imperfect transmission
change through (dialect) contact	change that is incrementally implemented by successive generations of language users
not applicable to resemanticisation	applicable to resemanticisation

↳ imperfect transmission

an innovative variant is gradually replacing the older variant, through a process of incrementation whereby each generation advances the relevant change beyond the level of the preceding generation

↓

source

children who are acquiring language

For the resemanticisation of Dutch pronominal gender, it is relevant that children appear to start from semantically motivated systems of pronominal gender, which are given up in favour of a grammatical system as they grow older.
According to De Houwer (1987), who investigates a child acquiring a southern variety of Standard Dutch, pronominal reference in three-year old children mainly operates on the basis of the animate-inanimate distinction: animate entities are referred to with hij ‘he’; for inanimate entities both hij ‘he’ and het ‘it’ are found. The motivation to use hij ‘he’ vis-à-vis het ‘it’ remains unclear in De Houwer’s account, but given the [number] of deviations from the adult system, grammatical gender hardly plays a role.
At the age of 7, noun semantics are still the main factor underlying pronominal reference (De Vogelaer 2010). Not only the animate-inanimate distinction but also mass-count and concrete-abstract play a crucial role: both mass nouns and abstracts tend to trigger the use of the neuter pronoun het ‘it’ even when they are not grammatically neuter. Nevertheless, substantial proportions of pronominal reference are in line with grammatical gender (De Paepe and De Vogelaer 2008).
Significantly, during adolescence the semantically driven usage of het ‘it’ further decreases in favour of pronominal reference in line with grammatical gender, but even at the age of 18–20 the adolescents do not quite attain the same proportions of grammatical gender as previous generations (De Vos 2009; De Vos and De Vogelaer 2011).

p. 126

The arbitrariness of Dutch gender

These results on Dutch pronominal gender are all the more striking since in other languages in which pronouns agree in gender with their antecedent nouns, the grammatical system appears to be mastered already at a very young age, to the extent that deviations from grammatical gender are extremely rare. Thus, German children of six hardly deviate from a noun’s grammatical gender in pronominal reference (Mills 1986: 92), and the same holds for French-speaking children (Maillart 2003; Van der Velde 2003: 328, 340). This is likely due to the arbitrariness of the Dutch gender system: gender of nouns referring to inanimates is not motivated semantically in Dutch, nor are there any clues in the form of (monomorphemic) nouns that allow gender to be determined (Durieux, Daelemans and Gillis 1999). Hence children acquiring Dutch can only derive nouns’ gender from the form of adnominal modifiers and pronouns, not from the form and/or meaning of the noun itself. This situation contrasts sharply with German and French, where gender assignment is at least partly motivated by semantic and/or formal regularities (see, e.g., Mills 1986 and Köpcke and Zubin 1996 on German, and Tucker, Lambert and Rigault 1977 on French). Such regularities minimise memory load, and are well-known to contribute to the acquirability of gender systems (Frigo and McDonald 1998; Gerken, Wilson and Lewis 2005).

More precisely, the acquisition of grammatical gender in pronominal reference should be conceived of as a process of ‘un-learning’ to use semantically motivated pronouns.

p. 126

Frequency effects

Frequency, and mechanisms of language change

p. 127

Theoretical reflections

Frequency effects recap

The role of frequency in linguistic change has been investigated extensively with respect to phonological change (see, e.g., Hooper 1976; Bybee 1995, 2001; Phillips 1984, 2001, 2006). The relevance of word frequency has been highlighted repeatedly, e.g. by Hooper (1976), who discusses two different frequency effects: on the one hand, processes of phonetic reduction are first visible in highly frequent items, whereas, on the other hand, processes of regularisation typically affect low-frequency items. In a survey of potential frequency effects in grammar, Bybee and Hopper (2001: 10–19) mention several types of frequency effects relating to language change, among which effects boiling down to a tendency in high-frequency patterns to engage in innovations (grammaticalization, lexicalization of multi-word-patterns, formal reduction, . . .), but also conservative effects in high-frequency patterns, such as the retention of certain morphological properties. Phillips (2006: 157) proposes that innovations implemented as speakers’ memory fails to provide the traditional variant typically affect low-frequency items, whereas changes directly involving the production of word forms as stored in memory affect the most frequent words first.

Dialect contact and salience

In addition to playing a role in sound change and other processes of ‘regular’ linguistic change, frequency is found to play a role in dialect contact. Thus, Trudgill (1986: 11–21, 43–53) describes processes of long-term accommodation of one dialect towards another, and observes that salient features are adopted more easily.₄ It is rather obvious that, all other properties being equal, highly frequent features are more salient than infrequent features, and thus Trudgill’s observations suggest that frequent items of the donor variety will be easily borrowed by the target variety. According to Phillips (2006: 141), however, such contact-induced changes will only affect high-frequency items provided that there are no ‘ideological’ reasons for doing otherwise (cf. also Trudgill 1986: 17–19, 125 on ‘extra-strong salience’) and if the relevant change directly involves the production of the relevant word form.

₄ Dialect contact is understood here in a broad sense, i.e. as including contact between dialects and prestige varieties such as Standard Dutch.

p. 128

Nature of resemanticisation

Resemanticisation = ideologically free change
↳ also: directly influences the production of word forms

↓

likely frequent items first

language users are more likely to adopt a noun’s Standard Dutch gender the more they are exposed to the relevant noun

It should be noted, however, that this hypothesis is only valid for speakers for whom the dialect is their ‘base variety’, and for whom the standard language is their second dialect. While this situation is common across older speakers in the Dutch language area, recent investigations into dialect usage in younger generations (especially Rys 2007) have revealed that it is nowadays probably more accurate to consider Standard Dutch to be the native dialect of children growing up in Belgium, since the acquisition of the dialect primarily takes place in adolescence, and is characterised by overgeneralisations typically found in second dialect acquisition.

On the one hand, dialect geographical data indicated that resemanticisation in Flemish is likely not taken over from other varieties (De Vogelaer and De Sutter 2011).
- Hence resemanticisation can be characterised as a development taking place within a speech community, more specifically as a type of regularisation, a kind of innovation being implemented as speakers’ memories fail (Phillips 2006: 157).
- This characterisation suggests that the phenomenon will be found primarily in low-frequency items.
On the other hand, it remains at least theoretically possible that resemanticisation is diffused from northern Dutch on a word-by-word basis.
- Assuming that the liability to semantically motivated pronominalisation may be lexically specific (cf. Smith 2001: 365, 373–374 and Poplack 2001: 411–414 on the potential of morpho-syntactic traits to be lexically specific), and that patterns of semantic agreement may be diffused from one variety to another, this creates a potential for highly frequent items to trigger semantic agreement more easily.

p. 129

sources for frequency information

Belgian CGN [(Corpus Gesproken Nederlands)] word list
Age of acquisition (AOA) list

p. 131

Processes and their informative word list
standardisation	resemanticisation
adult phenomenon	acquisition phenomenon
CGN list	AOA list

In a way, both the adoption of Standard Dutch gender and the acquisition of a noun’s grammatical gender (which makes the noun less susceptible to resemanticisation) can be described as learning processes. Hence it is expected that the influence of frequency on both phenomena is best described by means of a ‘learning curve’: the first instances of a Standard Dutch noun will contribute more strongly to the standardisation process than any succeeding ones, whereas the first instances of a noun will also be more crucial for children to determine the noun’s grammatical gender during acquisition (cf. also Hay and Baayen 2002: 208, who observe that differences amongst lower frequencies often are more salient than equivalent differences amongst higher frequencies). Therefore, rather than testing for correlations between the observed changes and raw frequency data, a logarithmic transformation has been applied to the frequency data (which indeed yields better fits).
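To see why a logarithmic transformation matches the ‘learning curve’ idea, the sketch below compares the same raw difference (ten extra exposures) at the low and at the high end of the frequency scale. The natural logarithm, the +1 offset for zero counts and the example counts are my own assumptions; the chapter only states that a logarithmic transformation was applied.

```python
import math


def log_frequency(count):
    """Natural log of a raw count; the +1 offset (an assumption) keeps zero counts defined."""
    return math.log(count + 1)


def log_gain(low, high):
    """Difference between two raw counts on the log-transformed scale."""
    return log_frequency(high) - log_frequency(low)


# Illustrative counts only: ten extra exposures for a rare and for a frequent noun.
gain_rare = log_gain(1, 11)
gain_frequent = log_gain(1000, 1010)

print(f"1 -> 11 exposures:      {gain_rare:.3f}")
print(f"1000 -> 1010 exposures: {gain_frequent:.3f}")
```

On the log scale the first ten exposures count for far more than ten additional exposures to an already frequent noun, which is the diminishing-returns pattern the learning curve describes.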

Dialect contact affects high frequency items

In order to investigate the role of frequency, for each word on the questionnaire the strength was calculated with which it is affected by each of the investigated tendencies. For instance, for the noun bos ‘forest’ 92 answers are available from regions where bos is traditionally a masculine noun, whereas it is neuter in Standard Dutch. In 74 cases, the neuter pronoun het ‘it’ was given as an answer. This means that bos ‘forest’ shows a standardisation ratio of 74/92 or 80%. This figure can then be correlated with the frequency data, i.e. both with (the logarithmic transformations of) the noun’s score on Schaerlaekens, Kohnstamm, and Lejaegere’s (2000) Target Vocabulary List and the noun’s frequency in the Spoken Dutch Corpus.
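The procedure described above can be sketched as follows. Only the bos figures (74 of 92) come from the notes; the other nouns, their answer counts and their corpus frequencies are hypothetical placeholders, and the choice of Pearson's r on natural-log frequencies is my assumption about how the correlation was computed.

```python
import math


def standardisation_ratio(standard_answers, total_answers):
    """Share of answers that use the Standard Dutch pronoun."""
    return standard_answers / total_answers


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# From the notes: 74 of 92 answers for bos use neuter 'het'.
bos_ratio = standardisation_ratio(74, 92)
print(f"bos: {bos_ratio:.1%}")

# HYPOTHETICAL placeholder data: (standard answers, total answers, corpus count).
placeholder_nouns = {
    "noun_a": (20, 100, 15),
    "noun_b": (45, 100, 120),
    "noun_c": (60, 100, 900),
    "noun_d": (80, 100, 4000),
}

ratios = [standardisation_ratio(s, t) for s, t, _ in placeholder_nouns.values()]
log_freqs = [math.log(freq) for _, _, freq in placeholder_nouns.values()]
r = pearson_r(log_freqs, ratios)
print(f"r (placeholder data): {r:.3f}")
```

With real data, one row per questionnaire noun would replace the placeholder dictionary, and the same call would be repeated with the Target Vocabulary List scores in place of the corpus counts.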

p. 132

results

no correlations with target vocabulary list
CGN has borderline significant effect (r = .442, p = .057)
rank has better correlation
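The notes do not say which rank correlation was used. The sketch below assumes Spearman's rho, computed as Pearson's r on average ranks, and runs on made-up numbers; it shows only why a rank-based measure can come out higher than Pearson's r when the relation is monotonic but not linear.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson's r applied to the average ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# HYPOTHETICAL data: a monotonic but non-linear relation.
frequencies = [1, 2, 3, 4, 5]
ratios = [0.01, 0.02, 0.03, 0.04, 0.90]

linear = pearson_r(frequencies, ratios)
ranked = spearman_rho(frequencies, ratios)
print(f"Pearson r: {linear:.3f}, Spearman rho: {ranked:.3f}")
```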

p. 133

From this it can be concluded that standardisation, at least in gender change, mainly affects highly frequent items: highly frequent items tend to shift towards Standard Dutch gender more easily.

Transmission and low frequency items

expectation 1: resemanticisation correlates negatively with frequency, i.e. infrequent items are affected more strongly by resemanticisation than items ranking high on the frequency lists
expectation 2: the Target Vocabulary List yields a stronger correlation than the raw frequency data from the Spoken Dutch Corpus

p. 134

↓

yes to both

Target List: r = -.579, p = .019
also good rank correlation

Both the table and the scatter plot indicate that items high on the target vocabulary list resist resemanticisation. The very same elements are believed to be acquired early and to be the most frequent items in young children’s speech (Vervoorn 1989: 40, 46; cf. section 4.1). The fact that the target vocabulary list yields much clearer results adds support to the idea that resemanticisation relates to the language acquisition process, providing an extra argument to consider it change through ‘imperfect transmission’.

p. 135

Significantly, the frequency data from the CGN do not correlate with resemanticisation. This may in part be due to the fact that the investigation only targeted a limited number of nouns, for which corpus frequency and target list score correlate less strongly than for most nouns.

p. 136

Be careful with the interpretation of “frequency”!

On the basis of the stronger correlations calculated by Vervoorn (1989: 64–65), it can be expected that large-scale investigations will reveal statistically significant correlations between resemanticisation and frequency data drawn from adults (such as CGN frequency). Indeed De Vos (2009) detects clear frequency effects with respect to the proportion of pronominal references in line with grammatical gender, using frequency data from adults rather than children. This, in turn, underscores the poly-interpretability of frequency effects: within the domain of diachronic research, frequency effects may reflect liability on a language pattern’s part to engage in processes of routinization (grammaticalization, phonetic reduction, . . .), different degrees of entrenchment in grammar, different ages of acquisition, etc. Hence researchers should be very explicit on the nature of frequency effects in their data, and on the underlying explanation. In many cases, frequency effects will merely reflect some deeper property of language patterns rather than being a conclusive explanation in their own right.

The data in this chapter are a case in point: during processes of standardisation, frequency effects reflect the intensity with which dialect speakers are exposed to nouns’ standard language gender; in resemanticisation, frequency effects reveal different ages at which nouns are acquired by children, which appears to influence the odds that these nouns’ grammatical gender can be learned successfully.

Conclusions

skipped
